Monday, March 13, 2017 


The OA interviews: Philip Cohen, founder of 
SocArXiv 


(A print version of this interview is available here) 


Fifteen years after the launch of the Budapest Open Access Initiative 
(BOAI), the OA revolution has yet to achieve its objectives. It does not help 
that legacy publishers are busy appropriating open access, and diluting it in 
ways that benefit them more than the research community. As things stand 
we could end up with a half revolution. 


But could a new development help 
recover the situation? More specifically, 
can the newly reinvigorated preprint 
movement gain sufficient traction, 
impetus, and focus to push the 
revolution the OA movement began in a 
more desirable direction? 




This was the dominant question in my 
mind after doing the Q&A below with Philip Cohen, founder of the new 
social sciences preprint server SocArXiv. 


Preprint servers are by no means a new phenomenon. The highly successful 
physics preprint server arXiv (formally referred to as an e-print service) was 
founded way back in 1991, and today it hosts 1.2 million e-prints in physics, 
mathematics, computer science, quantitative biology, quantitative finance 
and statistics. Currently around 9,000-10,000 new papers each month are 
submitted to arXiv. 


Yet arXiv has tended to complement — rather than compete with — the 
legacy publishing system, with the vast majority of deposited papers 
subsequently being published in legacy journals. As such, it has not 
disrupted the status quo in ways that are necessary if the OA movement is to 
achieve its objectives — a point that has (somewhat bizarrely) at times been 
celebrated by open access advocates. 


In any case, subsequent attempts to propagate the arXiv model have 
generally proved unsuccessful. In 2000, for instance, Elsevier launched a 
chemistry preprint server called ChemWeb, but closed it in 2003. In 2007, 
Nature launched Nature Precedings, but closed it in 2012. 


Hope springs eternal 


Fortunately, hope springs eternal in academia, and new attempts to build on 
the success of arXiv are regularly made. Notably, in 2013 Cold Spring 
Harbor Laboratory (CSHL) launched a preprint server for the biological 
sciences called bioRxiv. To the joy of preprint enthusiasts, it looks as if this 
may prove a long-term success. As of March 2017, some 8,850 papers 
had been posted, and the number of monthly submissions has grown to 
around 620. 


Buoyed up by bioRxiv’s success, and convinced that the widespread posting 
of preprints on the open Web has great potential for improving scholarly 
communication, last year life scientists launched the ASAPbio initiative. 
The initial meeting was deemed so successful that the normally acerbic 
PLOS co-founder Michael Eisen penned an uncharacteristically upbeat blog 
post about it (here). 


Has something significant changed since Elsevier and Nature 
unsuccessfully sought to monetise the arXiv model? If so, what? Perhaps the 
key word here is “monetise”. We can see rising anger at the way in which 
legacy publishers have come to dominate and control open access (see here, 
here, and here for instance), anger that has been amplified by a dawning 
realisation that the entire scholarly communication infrastructure is now in 
danger of being — in the words of Geoffrey Bilder — enclosed by private 
interests, both by commercial publishers like Elsevier, and by for-profit 
upstarts like ResearchGate and Academia.edu (see here, here and here for 
instance). 


CSHL/bioRxiv and arXiv are, by contrast, non-profit initiatives whose 
primary focus is on research, and facilitating research, not the pursuit of 
profit. Many feel that this is a more worthy and appropriate mission, and so 
should be supported. Perhaps, therefore, what has changed is that there is a 
new awareness that while legacy publishers contribute very little to the 
scholarly communication process, they nevertheless profit from it, and 
excessively at that. And for this reason they are a barrier to achieving the 
objectives of the OA movement. 


Reproducibility crisis 


But what is the case for making preprints freely available online? After all, 
the research community has always insisted that it is far preferable (and 
safer) for scholars to rely on papers that have been through the peer-review 
process, and published in respectable scholarly journals, in order to stay up 
to date in their field, not on self-deposited early versions of papers that 
might or might not go on to be published. 


Advocates for open access, however, now argue that making preprints 
widely available enables research to be shared with colleagues much more 
quickly. Moreover, they say, it enables papers to potentially be scrutinised 
by a much greater number of eyeballs than with the traditional peer review 
system. As such, they add, the published version of a paper is likely to be of 
higher quality if it has first been made available as a preprint. In addition, 
they say, posting preprints allows researchers to establish priority in their 
discoveries and ideas that much earlier. Finally, they argue, the widespread 
sharing of preprints would benefit the world at large, since it would speed 
up the entire research process and maximise the use of taxpayer money 
(which funds the research process). 


Many had assumed that OA would provide these kinds of benefits. In 
addition to making papers freely available, it was assumed that open access 
would introduce a quicker time-to-publish process. This has not proved the 
case. For instance, while the peer review “lite” model pioneered by PLOS 
ONE did initially lead to faster publication times, these have subsequently 
begun to lengthen again. 


Above all, open access has failed to address the so-called reproducibility 
crisis (also referred to as the replication crisis). By utilising a more 
transparent publishing process (sometimes including open peer review) it 
was assumed that open access would increase the quality of published 
research. Unfortunately, the introduction of pay-to-publish gold OA has 
undermined this, not least because it has encouraged the emergence of 
so-called predatory OA publishers (or article brokers), who gull researchers 
into paying (or sometimes researchers willingly pay) to have their papers 
published in journals that wave papers past any review process. 


The reproducibility crisis is by no means confined to open access publishing 
(the problem is far bigger), but it could hold out the greatest hope for the 
budding preprint movement. 


Why do I say this? And what is the reproducibility crisis? Stanford 
Professor of Medicine John Ioannidis neatly summarised the reproducibility 
crisis in 2005, when he called his seminal paper on the topic “Why most 
published research findings are false”. In this and subsequent papers 
Ioannidis has consistently argued that the findings of many published papers 
are simply wrong. 


Shocked at Ioannidis’ findings, other researchers set about trying to size the 
problem and to develop solutions. In 2011, for instance, social psychologist 
Brian Nosek launched the Reproducibility Project, whose first assignment 
consisted of a collaboration of 270 contributing authors who sought to 
repeat 100 published experimental and correlational psychological studies. 
Their conclusion: only 36.1% of the studies could be replicated, and where 
they did replicate, their effects were smaller than the initial studies’ effects, 
seemingly confirming Ioannidis’ findings. 


The Reproducibility Project has subsequently moved on to examine the 
situation in cancer biology (with similar initial results). Meanwhile, a 
survey undertaken by Nature last year would appear to confirm that there is 
a serious problem. 


Whatever the cause and extent of the reproducibility crisis, Nosek’s work 
soon attracted the attention of John Arnold, a former Enron trader who has 
committed a large chunk of his personal fortune to funding those working to 
“fix science”, as Wired puts it. In 2013, Arnold awarded Nosek a $5.25 
million grant to allow him and colleague Jeffrey Spies to found the Center 
for Open Science (COS). 


COS is a non-profit organisation based in Charlottesville, Virginia. Its 
mission is to “increase openness, integrity, and reproducibility of scientific 
research”. To this end, it has developed a set of tools that enable researchers 
to make their work open and transparent throughout the research cycle. So 
they can register their initial hypotheses, maintain a public log of all the 
experiments they run, and the methods and workflows they use, and then 
post their data online. And the whole process can be made open for all to 
review. 


Open Science Framework 


At the heart of the COS project is the Open Science Framework (OSF). 
This, COS executive director Brian Nosek explained to me last year, 
consists of two main components — a back-end application framework and a 
front-end view. “The back-end framework is an open-source, general set of 
tools and services that can be used to support virtually any service 
supporting the research lifecycle”, he explained, adding that the front-end is 
the interface through which researchers interact with the system. 


How will this help the preprint movement? If the objective is to make the 
entire research process open and transparent then posting preprints is clearly 
an essential part of the OSF vision. And to assist in this the Open Science 
Framework includes a module called OSF Preprints. Any researcher can 
post preprints directly into OSF Preprints. Importantly, the service also 
allows “collections” to be created. These can be collections of, say, journals, 
meetings, registries, or indeed preprints. And they can be community-based 
collections with a branded community interface. SocArXiv is one of those 
community interfaces. 


As COS Community Manager Matt Spitzer explained to me last year, 
“SocArXiv will simply be a branded service built on a generalised OSF 
preprint service.” 


As preprint fever spreads so a growing number of communities have begun 
to follow in SocArXiv’s footsteps. In the last few months we have seen the 
emergence of PsyArXiv, AgriXiv, and engrXiv, all of which piggyback on 
OSF Preprints. And most recently the Berkeley Initiative for Transparency 
in the Social Sciences has launched BITSS Preprints. In addition, The 
Electrochemical Society (ECS) has indicated that it too plans to leverage the 
Open Science Framework to create a preprint service. 


Elsewhere, the Latin American online library and publishing platform 
SciELO has announced plans to launch a preprint service. And for those in 
the humanities the Humanities Commons has launched CORE. 


True, CORE is described as a repository, but it caters for preprints too. 
Indeed, it seems likely that we could see repositories and preprint servers 
start to merge. In the Q&A below Cohen stresses that SocArXiv is not 
intended exclusively for preprints and, as we shall see, he believes it is 
important that it should not be. 


Clearly keen to play in the preprint pond, for-profits are riding the wave too. 
Both PeerJ and F1000Research now offer preprint services, although these 
are primarily intended to feed the pay-to-publish services these companies 
offer. Likewise, OA publisher MDPI has launched preprints.org, presumably 
for similar reasons. 


Finally, we could note that the American Chemical Society (ACS) has 
announced plans to launch a preprint server (ChemRxiv) too. This is ironic 
given its response to the launch of ChemWeb 17 years ago, but underlines 
how attitudes to preprints have changed. 


Central Service 


As the number of preprint servers increases, however, so concern has grown 
that the landscape could become overly complex, and inefficient. At 
ASAPbio’s third meeting, therefore, it was proposed that a central preprint 
service be created. Explaining the logic for this, ASAPbio commented “an 
increasing number of intake mechanisms ... may lead to confusion and 
difficulty in finding preprints, heterogenous standards of ethical disclosure, 
duplication of effort in creation of infrastructure, and uncertainty of 
long-term preservation.” 


ASAPbio has already attracted $1 million in funding for the mooted Central 
Service, and since OSF Preprints could be said to contain the seeds for 
creating this — in so far as it is fast becoming the platform of choice for 
those setting up preprint servers and because, courtesy of its partnership 
with SHARE, it is already harvesting preprints from third-party servers (i.e. 
bioRxiv, arXiv, PeerJ and CogPrints) — COS is bidding to build the 
ASAPbio Central Service. 


But the million-dollar question is whether this fledgling preprint movement 
has the potential to get the OA revolution back on track, and perforce reduce 
the degree of control publishers now have over scholarly communication. 
Key to this, of course, will be whether the new services can attract sufficient 
papers to make them viable, and whether they will prove financially 
sustainable over time. Above all, however, their success will depend on 
whether they can play a meaningful role in reinventing scholarly 
communication for the networked world. 


Here we could note that in the Q&A below Cohen voices concern that 
ASAPbio envisages the Central Service as catering for preprints alone. This, 
he says, could prove “a gift to the publishers, who retain their dominance by 
controlling the so-called ‘version of record.’” 


He adds: “There is no reason to erect this barrier between systems, where 
the ‘preprints’ system only publishes non-reviewed work, and the journals 
only publish reviewed work — except to protect the revenue stream of the 
publishers.” 


Importantly, Cohen warns, fixating on “the idea of the ‘complete’ draft may 
impede innovation toward more advanced forms of communication.” 


As we noted, arXiv has done little to disrupt the legacy publishing system. 
The danger is that the new generation of preprint servers will achieve little 
more than arXiv in this regard. That is, they could become no more than 
repositories of articles jostling for a place in traditional (or pay-to-publish 
OA) journals. Already we can see journal editors seeking to position them 
as passive reservoirs of papers waiting to be selected for publishers’ 
pay-to-publish mills. 


Preprint servers have the potential to be far more than that. They should be 
viewed as nurseries in which new forms of scholarly communication are 
experimented with and developed. As such, they should be viewed as 
separate and, to a great extent, independent of the legacy journal system. 
One hopeful sign here is that we have seen the emergence of new overlay 
journals like Discrete Analysis and Quantum. Built on top of arXiv these 
tend to be scholar-led, community owned journals created and managed to 
review, highlight, and disseminate high-quality research papers, not to 
monetise them. As such, they can be seen as alternatives rather than 
complements to the traditional system (and its oligopolists). 


Complete the revolution? 


Evidently, Cohen would like to see SocArXiv play a similar role. When I 
asked him if he envisaged the service adding comment and post-publication 
review functionality, or becoming a platform for new overlay journals he 
replied, “[I]t’s important to point out that, as an open service, it is possible 
right now for anyone to develop those functions. Any institution, working 
group, department, or library could put up a list of papers, automatically or 
manually generated, and host discussions on them, facilitate peer review, 
and produce their own overlay journals. A big part of our outreach job in the 
coming year is to get people who have the knowhow and resources to 
develop such things to jump on it and bring them to fruition.” 


Cohen clearly also has his eye on a world beyond the traditional journal. 
Writing on the LSE blog last year he said, “I hope that SocArXiv will 
enable us to save research from the journal system.” 


And below he points out that scholarly work involves far more than journal 
articles, not least data and commentary. “SocArXiv does not require the 
disruption of the journal system, but if we help make that happen, and help 
build a better system to replace it, I would be glad.” 


The good news is that if the preprint movement flourishes, and manages to 
maintain an existence independent of traditional publishers, it has the 
potential to complete the revolution the OA movement began. And if all else 
fails, it could seek to cut publishers out of the loop altogether and take back 
ownership of scholarly communication. 


Alternatively, of course, it may — like the OA movement more generally — 
end up captured and exploited by legacy publishers, who will seek to use it 
in a way that props up the outdated and inefficient model of scholarly 
communication that currently allows them to make excessive profits from 
the public purse. Not only would this be a waste of taxpayers’ money, but it 
would hobble and hold back the global research endeavour. 


The interview begins ... 


RP: What is SocArXiv, who should use it, and 
why? 


PC: SocArXiv is an open archive of the social 
sciences, a free, noncommercial service for rapid 
sharing of academic papers. It is built on the 
Open Science Framework, an open access, open 
source platform that also allows researchers to 
upload entire projects (e.g., data and code) and 
link them to research results. 


Anyone who does research in the social sciences 
should consider using it. Because SocArXiv is a 
not-for-profit alternative, researchers can be assured that they are sharing 
their research in an environment where access, inclusivity, and preservation, 
rather than profit, will remain at the heart of the mission. 


All this is in contrast to the for-profit companies that want to monetize your 
research, including Academia.edu, ResearchGate, the Elsevier products 
Mendeley and SSRN, and Google Scholar. They may or may not provide 
people with something useful — access, storage, social networking, metrics — 
but they exist to make money for their investors, and that’s not our mission. 


RP: How is the service managed and by whom? 


PC: SocArXiv is administratively housed at the University of Maryland, 
under my direction, with a steering committee of sociologists and academic 
librarians. That means that our grant money is administered by UMD, and 
we receive tax deductible contributions through the university’s foundation. 


In our operations we are a partner of the nonprofit Center for Open Science 
(COS), which built and operates the archive. As a member community of 
the COS Preprints service, we participate in their Advisory Group, which 
consults on questions of governance and technology. 


RP: As you indicated, the SocArXiv steering committee is heavy on 
sociologists. Does that tell us anything beyond the fact that you are a 
sociologist and so presumably reached out to your colleagues in the first 
instance? 


PC: That’s correct. Being a small operation, it helped to start with people in 
one discipline as a way to organize our discussion of needs and desires — 
what we want, and how can we make it happen. 


Of course, our needs and desires are very similar to those of people in other 
disciplines, but it helped to think locally. The system is open to all 
disciplines — anyone who wants their work to appear under the words 
“social science” (we have a number of papers, for example, from 
anthropology, geography, and urban planning). It’s also important that the 
sociologists on the committee include experts in such subjects as the 
sociology of knowledge, organizations, social movements, and higher 
education. 


Beyond our researchers, by working with leaders of the academic library 
community as well, we are developing the project on a foundation of good 
preservation, access, and public service — and lots of experience managing 
information projects. Additionally, as we gain institutional supporters we are 
including them on a consultative advisory board. 


RP: Are social scientists more or less likely to embrace open access and 
preprint servers than other disciplines? What are the discipline-specific 
issues here, and are there any disincentives for social scientists to use a 
service like SocArXiv? 


PC: I can’t generalize to social science in general, but some patterns are 
clear. For example, economists are used to reading important work online 
before it’s peer reviewed, and they have high-status outlets for working 
papers that are recognized outside of academia — as when major news 
organizations report on NBER Working Papers. 


Sociologists, on the other hand, expect to hear about interesting research 
first at a conference — where they will see slides but not have access to a 
paper — and then wait months (or years) to read it in a peer reviewed 
journal. I use that example purposefully, because it also correlates with the 
massive disparity in social and political influence between economics and 
sociology. 


RP: How receptive are social science journals to accepting papers that 
have been on a preprint server? Is there an issue here? 


PC: I don’t know of any major social science journal that will not accept 
papers that have been posted in a public repository. The American 
Sociological Association, for example, although it has a bad track record of 
operating for-profit journals and discouraging open access, explicitly 
permits publication in all of its journals of papers that have been posted in 
non-peer-reviewed repositories. 


RP: We last spoke in July 2016. What has changed since then, and is the 
service proving more or less popular/successful than you anticipated? 


PC: We have made great strides since our soft launch last summer as the 
first community in the OSF Preprints service. In December COS launched a 
more fully featured web interface for uploading and discovering papers, and 
several other communities have started up (in agriculture, psychology, 
engineering, and research transparency). All the papers from these services 
become part of the same open system. 


As COS is leading on the technology, we have been concentrating on the 
scholar and community side. We have received grants of $50,000 each from 
the Alfred P. Sloan Foundation, and the Open Society Foundations, and 
contributions of $10,000 each from two libraries (UCLA and MIT). At the 
University of Maryland, we have received support from the Department of 
Sociology, the College of Behavioral and Social Sciences, and the 
University Libraries. We are using this money for outreach and 
development, to build the user base and expand the community, and to bring 
people together to work on next steps. 


To that end, this year we will hold a symposium called O3S: Open 
Scholarship for the Social Sciences, on the UMD campus October 26-27. 
We hope it will be the first in a series of conferences, and we will feature 
panels showcasing open scholarship, research on open scholarship, and a 
workshop on the future of SocArXiv. With keynote addresses by COS 
co-founder and CTO Jeffrey Spies and sociologist Tressie McMillan 
Cottom, we think this is going to be a great event. And we will have some 
funding to bring junior scholars to the symposium. (The call for papers and 
more information is available on our blog site, SocOpen.org.) 


Meanwhile, new people are posting papers every day. At this writing we 
recently passed 800 papers, posting at a rate of several per day. March looks 
great, starting off at double the rate of the previous two months. Of course, 
this is an infinitesimal fraction of the social science coming out. I had 
naively thought we would grow faster. 


The users remain concentrated among people who use Twitter and people 
who are motivated to move their papers over from the corporate paper sites. 
So there is lots of room for growth, and outreach is the watchword. 


New scholarly communication system 


RP: Preprint servers seem to be enjoying a new lease of life, particularly 
in the wake of the launch of bioRxiv and the ASAPbio initiative. Most 
recently, we have seen announcements for new preprint servers from 
SciELO and ECS. Do you see SocArXiv as part of a new movement? If so, 
how would you characterise the nature and the goals of this movement? 


PC: Preprints are a good workaround for our highly dysfunctional journal 
publishing system. With preprints you can get your work out in a timely 
way, to actual readers, while preserving your ability to publish in regular 
journals for prestige and promotion. Lots of credit to the big idea from 
arXiv.org, which started this for math and physics decades ago. They have 
preserved their journal system while enhancing the efficacy and efficiency 
of their research. 


This is what we want to do for the social sciences in the near term, while 
participating in the broader interdisciplinary movement to build a new 
scholarly communication system over time. 


RP: We have also seen a recent call for a central preprint service. Some 
have expressed doubts about this. For instance, quantum physicist 
Michael Nielsen commented “it creates an effective monopoly, which 
tends to suppress innovation”. On the other hand, the institutional 
repository movement has demonstrated that creating an effective 
distributed system faces its own kind of challenges. What are your views 
on the need for a central service, and the pros and cons of central vs. 
distributed services? Would a central service be competitive with 
subject-specific preprint servers like SocArXiv in your view, or complementary? 


PC: I have positive and negative responses to the central preprint service. 
On the positive side, I reject the fear that a central service will be a 
monopoly and suppress innovation. This shows a fundamental 
misunderstanding of open systems. If they are really open, they can’t be 
monopolies, because they present no obstacles to entry or innovation. You 
can’t start a petroleum or journal publishing company today because Exxon 
or Elsevier will crush you in the marketplace — you need to take sales away 
from them to succeed, and they will sell what you are selling for less, 
preventing you from getting started. Truly open scholarship is not like that. 
Anyone can distribute the information however they want without taking it 
away from anyone else. 


Of course there is competition in open scholarship — for attention, for grant 
money, for legitimacy — but it is not like actual market competition because 
the products are free and unlimited copies. The idea that ASAPbio or COS 
is dominant like Exxon or Elsevier is dominant is just very naive about the 
power of global capitalism. 


Seriously, Elsevier is making billions of dollars off a premodern publishing 
system that no one in their right mind would have designed this way half a 
century ago. That’s suppressing innovation. COS is the size of a thumb 
drive to them; it could be a thousand times bigger without posing the threat 
to innovation that they do. On the contrary, beyond their own innovation, 
open platforms like the OSF encourage innovation by others because 
anyone can build integrations and applications on top of them. 


And that brings me to my negative response. ASAPbio intends the Central 
Service to include only preprints, which they define as, “Complete and 
public drafts of scientific documents, yet to be certified by peer review.” I 
believe this definition — which preserves the journal article as the unit of 
scholarly output — is limiting in two ways. 


First, by insisting that preprints are not yet peer reviewed, it is a gift to the 
publishers, who retain their dominance by controlling the so-called “version 
of record.” There is no reason to erect this barrier between systems, where 
the “preprints” system only publishes non-reviewed work, and the journals 
only publish reviewed work — except to protect the revenue stream of the 
publishers. 


Second, the idea of the “complete” draft may impede innovation toward 
more advanced forms of communication. Of course that is how most 
researchers in the journal disciplines work today, but a more innovative 
future is within our grasp. 


In real life, today, scholarly work includes registrations, code, data, 
comments, and reviews themselves — but we usually only count published 
papers. Work does not stop when a draft is “complete.” Just yesterday I had 
the very common, frustrating experience of flipping back and forth between 
two papers by the same research team, produced in series, with the second 
building only very slightly off the first. The team was spinning out small 
bits of “complete” research in rapid succession, to publish them as quickly 
as possible — and maximize the lines on their CVs. 


If scholarly communication were allowed to break out of the journal article 
mode, they could simply have rolled out sequential analyses along a 
research path. The peer review system that accompanies such innovation 
would be more efficient and — if it were conducted according to open 
scholarship principles — more informative and engaging, with reviews of 
different components of the research ideally provided as context to readers 
and researchers alike as the project evolves. 


This is just one scenario, used to illustrate the possibilities for genuine 
innovation outside the relatively ancient and hidebound paper system. 
Post-publication review may turn out to be great, and I’m worried that a narrow 
definition of preprints will hinder that potential development. 


For what it is worth, although we are on a system called “OSF Preprints,” 
SocArXiv invites people to post working papers (drafts in progress), 
preprints (things to be published), and postprints (things already published 
elsewhere), as long as the author has the right to distribute them. We see no 
reason to impose limits to one or another of these categories. 


Clearly, the norms and practices associated with emerging scholarly 
communication systems are yet to be established. We want to develop new 
ideas while also allowing people to get jobs, get promoted, and use peer 
review to maintain standards of quality — all at higher speed and reduced 
cost — and we think we’re off to a great start at doing that. 


One final point on the Central Service: I’m excited by the proposal from the 
Center for Open Science, in response to the Request For Applications. In 
addition to an exemplary model of community governance, great 
technology, and a demonstrated commitment to open science principles in 
so many ways, COS offers the prospect of a preprint system that ties in to a 
wider set of tools and materials, which — while meeting the requirements of 
the RFA — might allow the system that evolves to be less constraining than 
I’m afraid it might otherwise be. I don’t know who the other contenders are, 
but I’d love to see COS build it. 


RP: As SocArXiv will be using the Center for Open Science platform it 
will be linked into SHARE. What does SHARE bring to the party? 
Presumably its function is as a discovery service only, since its currency is 
metadata rather than full-text, right? The OAI-PMH harvesting protocol 
that the IR movement developed was based on metadata, but has not 
really been that successful. What are your thoughts on these matters? 


PC: What SHARE brings to SocArXiv is the same thing it brings to all of 
the 150 data sources it currently aggregates, from the giant arXiv and 
PubMed Central to smaller individual institutional repositories and 
SocArXiv. SHARE is not designed to be a discovery platform in and of 
itself; it harvests, normalizes, and then distributes a dataset of research 
events, which include the posting of preprints. 


Through SHARE, the Association of Research Libraries and COS provide 
public infrastructure for disseminating metadata for any purpose. SHARE 
provides great opportunities for SocArXiv, allowing people to create custom 
research streams, institutional reports, discovery tools, and anything else 
you can do with research metadata. 


As a rudimentary example, I myself (knowing next to nothing about such 
things) built a Twitter feed for SocArXiv papers using SHARE 
(@socarxivpapers), which I described on our blog. 


Someone who knew what they were doing could do a lot more, and we’re 
excited to make that possible. (I am not dodging the question of OAI-PMH; 
it’s just beyond my expertise to comment on that.) 


RP: You mentioned data and software code earlier. SocArXiv acts as a 
repository for these too? 


PC: Yes. SocArXiv and the other services on OSF Preprints run on the 
Open Science Framework. Preprints may be nested within projects on that 
platform, and include any research materials. 


This is a very powerful and flexible platform, which includes storage, 
researcher collaboration tools, versioning, analytics, variable public access 
settings, and the ability to mint DOIs. This is a great benefit of working 
with COS, which is providing this application framework as a free public 
good. 


Copyright 


RP: Last July, The Scholarly Kitchen gave you a hard time over whether 
uploads to SocArXiv are vetted, and suggested that without moderation 
the service will have a problem with regard to copyright infringement. 
What is the current situation, and who is responsible if a paper uploaded 
to the service infringes someone’s copyright? Likewise, how are nonsense, 
off-topic and inappropriate papers filtered out (are they)? 


PC: Our mission is to provide access, not to police copyright. All SocArXiv 
users agree to the COS terms of use, which, in accordance with the Digital 
Millennium Copyright Act, offers a means of complaining if anyone thinks 
something has been posted in violation of their copyright. 


To my knowledge we have yet to receive such a complaint. Maybe 
Scholarly Kitchen thinks everyone has a moral obligation to play the role of 
copyright police. This is not our job. Although we will of course comply 
with the law, as noted, we’re not raising and spending money and recruiting 
volunteers to devote to the prevention of minor copyright infractions. 


In my experience, most authors have no idea what’s in the ridiculous 
contracts they sign, and they often veer between exaggerated paranoia and 
reckless egalitarianism when it comes to sharing their work. 


Often, we get the worst of both worlds. For example, I learned from your 
tweet of a new (paywalled) study finding that 40% of papers on 
ResearchGate were in violation of publisher copyrights. This is a case when 
researchers are stealing their own work from Elsevier (and others) and then 
giving it to ResearchGate to sell, for which the researcher receives nothing. 
Congratulations, academic freedom! As I wrote about Sci-Hub, “if your 
entire enterprise can be brought down by the insertion of 11 characters into 
a URL, your system may in fact not be sustainable.” 


On the question of moderation and quality control, at present papers are not 
vetted before they are posted. We manually take down the very few things 
that are obviously inappropriate. This works when you’re taking in a few 
papers a day, but obviously we will need a more robust moderation system 
as the service grows, including clear guidelines and a routine plagiarism 
check. 


It is our hope that we can persuade researchers to reallocate some of the 
time they currently donate as reviewers in the service of monopolistic for- 
profit companies to our public-good project, and volunteer to work as 
moderators (as arXiv has done). COS is currently developing the 
moderation dashboard we will need to carry this out. 


That said, I personally think it would be good for us to get beyond the fear 
of having our work contaminated by the proximity to work of lesser quality 
(or elevated by the esteemed contributions of others, for that matter). It is 
different when people discover books by browsing shelves; in that case it’s 
a shame to have bad books getting in your way. But with a free digital 
archive the downside to accepting bad work is not so great. 


We expect people will mostly find specific research on SocArXiv through, 
for example, published citations, the recommendations of colleagues, 
through aggregations created by subject experts, from institutional lists, 
conference programs, and social media. 


We also hope to provide tools such as lists of most-read, most-cited, most- 
favourably reviewed, and so on (or these may be developed by third 
parties). Most mathematicians don’t read raw feeds from arXiv, and we 
don’t think that’s how people will use SocArXiv either. 


I think we will be able to surface great work without requiring all 
submissions to be of high quality, with all the energy and expense that 
would entail. We encourage people to brag not about the existence of their 
paper on SocArXiv, but rather about its value. 


RP: Where does SocArXiv fit with the larger agenda that I think you refer 
to as “open scholarship”? Where does open begin and end so far as 
research in social sciences is concerned? 


PC: To clarify, when we say “open scholarship,” we are aligning with the 
open science movement, but including those who don’t consider their work 
to be “science.” The open approach responds to many of the problems we 
face in the research community today, including the long run issues in 
academia generally and the current crisis associated with the Trump 
presidency. 


The SocArXiv steering committee just posted a statement in response to the 
planned March for Science, titled “Social Science without Walls,” which 
summarizes our view on this question. In it we argue that SocArXiv will 
help us realize our collective goals of making our work better, more 
efficient, more relevant, and less hierarchical. 


The social science without walls made possible by open source, open access 
research infrastructure, we wrote, “allows us to make the best use of our 
resources, improve the process and products of our work, bring it to more 
people faster, and dissolve the obstacles to interaction that plague our 
industry.” 


From the research process itself through dissemination of results and — 
crucially, today especially — engagement with wider publics, open 
scholarship is foundational to our vision of social science. 


RP: I believe you are of the view that the research community needs to 
take back control of scholarly publishing. What does that mean in 
practice? Does it mean, for instance, you believe traditional publishers no 
longer have a valid role in scholarly communication? And how does 
SocArXiv facilitate the process of taking back control? 


PC: Most of what commercial journal publishers do, academics actually do. 
We research, write, review, edit, and promote our work — and commercial 
publishers organize that labour, partly to our benefit and the public’s benefit 
but largely to their own. Some of what they do is outside of our expertise, 
including editing and producing publications, but those functions are 
secondary. And a lot of what they do is only necessary to serve the needs of 
the system they rely on, such as marketing and policing copyrights and 
devising means of keeping content from reaching readers. 


An open access scholarly publishing system could do more, faster, better, 
and vastly cheaper, without most of what commercial publishers do. 
SocArXiv does not require the disruption of the journal system, but if we 
help make that happen, and help build a better system to replace it, I would 
be glad. 


Funding and the future 


RP: You mentioned that SocArXiv has already attracted some funding. 
Can you say more about funding and how it can be assured over time? 
How successful have you been to date in your funding efforts? Can you 
envisage the service ever offering paid-for services in order to be 
financially sustainable? If so, what kind of services, and whom would you 
expect to be billed? 


PC: The operation of the archive is funded by COS at present. The grant 
money and institutional contributions we have so far are going to design and 
outreach and governance efforts. I hope we will be able to continue building 
the system with money from foundations such as those that support us now, 
as we develop a model of sustainability that derives support from the 
voluntary contributions of academic institutions and research funders. 


I have been inspired by arXiv’s model (and they have graciously consulted 
with us, in addition to letting us riff off their name), and I hope that we can 
follow in their footsteps on sustainability as well. 


We are committed to offering a free service for researchers and readers, and 
open access indefinitely. We might in principle offer ancillary services to 
institutions for a fee, but we have as yet no such plans. (Note to institutional 
readers: if you are currently paying SSRN thousands of dollars per year for 
a paper series or a list of papers, contact us!) 


RP: To what extent is the SocArXiv project focused on advocacy as much 
as service provision? More generally, is there a danger that the preprint 
movement might end up chasing after buzzwords and trends, rather than 
sparking fundamental change in scholarly communication (which I think 
has been a tendency within the OA movement)? 


PC: We have to do some of both — advocacy and service provision — but 
ultimately I hope our service will be our advocacy. I can write polemics all 
day long (and I often do), but in the absence of a working open archive they 
won’t mean that much. 


Participation in the archive is not conditional on some political or social 
movement affiliation. At our most ambitious we do want to shift the ground 
on which social science is built, but that’s going to require offering 
something new and professionally rewarding beyond a cutting critique of 
Elsevier. 


RP: What future plans are there for SocArXiv? I have seen mention of a 
comment function, post-publication review, overlay journals? Are these 
all on the table? What other features/functionality do you anticipate 
offering in the future? 


PC: Those are all potentially important features, although not as important 
as a smoothly operating basic archive, with transparent governance, shared 
norms, and community support — so I’m not rushing. 


However, it’s important to point out that, as an open service, it is possible 
right now for anyone to develop those functions. Any institution, working 
group, department, or library could put up a list of papers, automatically or 
manually generated, and host discussions on them, facilitate peer review, 
and produce their own overlay journals. A big part of our outreach job in the 
coming year is to get people who have the knowhow and resources to 
develop such things to jump on it and bring them to fruition. 


I especially want to encourage people who are already in the business of 
aggregating papers — such as conferences and paper competitions — to use 
the system. Anyone running a paper competition could require the papers be 
posted on SocArXiv, where they could be juried as they are made public. 


Similarly, conference submissions could be done through the archive, with 
papers tagged according to their panel sessions or subject areas. These are 
simple examples of how we could do work we are already doing but in an 
open way, using the tools SocArXiv already has made available to move 
toward an open scholarship culture. 


RP: What are the primary obstacles today to achieving the changes you 
would like to see to scholarly communication, how can they be overcome, 
and what long-term opportunities does the open agenda offer the research 
community? 


PC: You may have meant practical obstacles, but all these words later I’m 
inclined toward a more philosophical answer. To my mind, our biggest 
obstacles are institutional inertia and risk aversion. 


No reasonable person would design an academic publishing system like this 
if we were building it today. When I was a grad student in the early 1990s, 
before the web, we had to physically be in the library to read the journals (I 
did not subscribe to any). Now that we have the capacity to provide them to 
anyone anywhere at a fraction of the cost, are they any more accessible? 


The great innovations in journal publishing technology in the last quarter 
century seem to have gone to building and maintaining elaborate paywall 
and authentication systems, and legal protocols to enforce them — and more 
is spent keeping people out than bringing people in. 


The American Sociological Association, in my own discipline, still allocates 
“pages” to journal editors according to the cost of printing and shipping 
paper, setting an arbitrary limit to how many “top” articles may exist. 
Fearing a future in which “the journal world may not be as profitable in the 
future as it is now,” ASA’s response is to work on inventing new paywall 
journals. 


Inertia is normal for social institutions, of course, but journal publishing 
seems to have more than most. I’m sure this comes from the slow turnover 
of generations in academia, and from the constricting job market that 
compels professors to squeeze harder to make students in their own image, 
out of fear of joint failure. 


That’s probably also why they fight so hard to maintain our arbitrary 
prestige and ranking system, which bestows success or failure on scholars 
before anyone beyond a tiny committee of reviewers has laid eyes on their 
actual work, much less assessed its impact. 


There is also big money at stake. But it’s not just executives and managers 
of the multinational conglomerates that sit atop the system, it’s also the 
conferences and receptions and awards (and tote bags) they dole out, for 
which the vast majority of faculty continuously scrap. 


We could do so much better for so much less money. 


But there are risks. We have to be willing to try new things, to step out from 
under the current system. We have to evaluate people not based on the 
pedigree of their journal publications but on the quality of their work. We 
have to reward career pathways that differ from the ones that got us where 
we are. 


Some attempts will fail. But if we’re guided by sound principles, focus on 
what’s important, and play to our strengths — doing the things we do well 
and contracting for the things we don’t — the rewards will be greater down 
the road. And that’s what the open agenda offers. 


RP: Thank you for taking the time to answer my questions. 


Posted by Richard Poynder at 41 